Applying Ant Colony Optimization to configuring stacking ensembles for data mining

نویسندگان

  • YiJun Chen
  • Man Leung Wong
  • Haibing Li
چکیده

An ensemble is a collective decision-making system which applies a strategy to combine the predictions of learned classifiers to generate its prediction of new instances. Early research has proved that ensemble classifiers in most cases can be more accurate than any single component classifier both empirically and theoretically. Though many ensemble approaches are proposed, it is still not an easy task to find a suitable ensemble configuration for a specific dataset. In some early works, the ensemble is selected manually according to the experience of the specialists. Metaheuristic methods can be alternative solutions to find configurations. Ant Colony Optimization (ACO) is one popular approach among metaheuristics. In this work, we propose a new ensemble construction method which applies ACO to the stacking ensemble construction process to generate domain-specific configurations. A number of experiments are performed to compare the proposed approach with some well-known ensemble methods on 18 benchmark data mining datasets. The approach is also applied to learning ensembles for a real-world cost-sensitive data mining problem. The experiment results show that the new approach can generate better stacking ensembles. Over years of development, it has become more and more difficult to improve significantly the performance of a single classifier. Recently, there has been growing research interest in the method to combine different classifiers together to achieve better performance. The combining method is referred to as Ensemble. In early research, ensembles were proved empirically and theoretically to perform more accurately than any single component classifier in most cases. If an ensemble is generated by a set of classifiers which are trained from the same learning algorithm, this ensemble is a homogeneous ensemble. If an ensemble is generated by a set of classifiers, which are trained from different learning algorithms, this ensemble is a heterogeneous ensemble (Dietterich, 2000). For example, Bagging (Breiman, 1996) and Boosting (Schapire, 1990) are homogeneous ensembles, while stacking (Wolpert, 1992) is a heterogeneous ensemble. To generate an ensemble to achieve expected results, two important things should be considered carefully. The first is to introduce enough diversity into the components of an ensemble. The second is to choose a suitable combining method to combine the diverse outputs to a single output (Polikar, 2006). The diversity is the foundation of an ensemble. However, as the diversity increases , the marginal effect decreases after a certain threshold. The memories and computing cost increase significantly while the performance does not improve steadily. …

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Hybrid ANFIS with ant colony optimization algorithm for prediction of shear wave velocity from a carbonate reservoir in Iran

Shear wave velocity (Vs) data are key information for petrophysical, geophysical and geomechanical studies. Although compressional wave velocity (Vp) measurements exist in almost all wells, shear wave velocity is not recorded for most of elderly wells due to lack of technologic tools. Furthermore, measurement of shear wave velocity is to some extent costly. This study proposes a novel methodolo...

متن کامل

Noisy images edge detection: Ant colony optimization algorithm

The edges of an image define the image boundary. When the image is noisy, it does not become easy to identify the edges. Therefore, a method requests to be developed that can identify edges clearly in a noisy image. Many methods have been proposed earlier using filters, transforms and wavelets with Ant colony optimization (ACO) that detect edges. We here used ACO for edge detection of noisy ima...

متن کامل

A systematic approach for estimation of reservoir rock properties using Ant Colony Optimization

Optimization of reservoir parameters is an important issue in petroleum exploration and production. The Ant Colony Optimization(ACO) is a recent approach to solve discrete and continuous optimization problems. In this paper, the Ant Colony Optimization is usedas an intelligent tool to estimate reservoir rock properties. The methodology is illustrated by using a case study on shear wave velocity...

متن کامل

Diagnosis of the disease using an ant colony gene selection method based on information gain ratio using fuzzy rough sets

With the advancement of metagenome data mining science has become focused on microarrays. Microarrays are datasets with a large number of genes that are usually irrelevant to the output class; hence, the process of gene selection or feature selection is essential. So, it follows that you can remove redundant genes and increase the speed and accuracy of classification. After applying the gene se...

متن کامل

Applying Ant Colony Optimization for Inducing Decision Tree from Mixed Type of Relational Multiclass Data

Decision trees have been widely used in data mining and machine learning as a comprehensible knowledge representation. Originally the method to construct decision tree follows greedy approach which results in local tree and classification rules. Ant Colony Optimization (ACO) is a metaheuristic approach used to get more optimal solution compared to other methods from large search space. In propo...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Expert Syst. Appl.

دوره 41  شماره 

صفحات  -

تاریخ انتشار 2014